A Novel Minimum Divergence Approach to Robust Speaker Identification
Abstract
In this work, a novel solution to the speaker identification problem is proposed through minimization of statistical divergences between the probability distribution (g) of feature vectors from the test utterance and the probability distributions of the feature vectors corresponding to the speaker classes. The approach is made more robust to outliers through the use of suitably modified versions of the standard divergence measures. The resulting minimum distance solutions are referred to as the minimum rescaled modified distance estimators (MRMDEs). Three measures are considered: the likelihood disparity, the Hellinger distance and Pearson’s chi-square distance. The proposed approach is motivated by the observation that, in the case of the likelihood disparity, when the empirical distribution function is used to estimate g, the method becomes equivalent to maximum likelihood classification with Gaussian Mixture Models (GMMs) for the speaker classes, a highly effective approach used, for example, by Reynolds [22] with Mel Frequency Cepstral Coefficients (MFCCs) as features. Significant improvement in classification accuracy is observed under this approach on the benchmark speech corpus NTIMIT and a new bilingual speech corpus NISIS, with MFCC features, both in isolation and in combination with delta MFCC features. Moreover, the ubiquitous principal component transformation, by itself and in conjunction with the principle of classifier combination, is found to further enhance the performance.
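The equivalence mentioned in the abstract can be illustrated with a toy example. The sketch below is not the paper's method: it assumes discrete class distributions over four cells instead of GMM densities on MFCC features, and it uses the standard, unmodified forms of the three divergences (scaling constants differ between references) rather than the rescaled modified versions behind the MRMDEs. All function names, class labels and numbers are hypothetical.

```python
import numpy as np

def likelihood_disparity(d, f):
    """Sum of d * log(d / f) over cells with d > 0 (assumes f > 0 there)."""
    d, f = np.asarray(d, float), np.asarray(f, float)
    mask = d > 0
    return float(np.sum(d[mask] * np.log(d[mask] / f[mask])))

def hellinger_distance(d, f):
    """Twice the squared Hellinger distance, one common scaling."""
    d, f = np.asarray(d, float), np.asarray(f, float)
    return float(2.0 * np.sum((np.sqrt(d) - np.sqrt(f)) ** 2))

def pearson_chi_square(d, f):
    """Pearson's chi-square distance (assumes f > 0 in every cell)."""
    d, f = np.asarray(d, float), np.asarray(f, float)
    return float(np.sum((d - f) ** 2 / f))

def classify_min_divergence(samples, class_pmfs, divergence):
    """Pick the class whose distribution is closest to the empirical one."""
    n_cells = len(next(iter(class_pmfs.values())))
    # Empirical distribution of the test samples (relative frequencies).
    d = np.bincount(samples, minlength=n_cells) / len(samples)
    scores = {name: divergence(d, pmf) for name, pmf in class_pmfs.items()}
    return min(scores, key=scores.get)

def classify_max_likelihood(samples, class_pmfs):
    """Pick the class with the largest log-likelihood of the samples."""
    loglik = {
        name: float(np.sum(np.log(np.asarray(pmf, float)[samples])))
        for name, pmf in class_pmfs.items()
    }
    return max(loglik, key=loglik.get)

# Hypothetical speaker classes and a hypothetical test "utterance".
class_pmfs = {
    "speaker_a": [0.5, 0.3, 0.1, 0.1],
    "speaker_b": [0.1, 0.2, 0.3, 0.4],
}
samples = np.array([0, 0, 1, 0, 2, 1, 0, 3])

ld_label = classify_min_divergence(samples, class_pmfs, likelihood_disparity)
hd_label = classify_min_divergence(samples, class_pmfs, hellinger_distance)
pcs_label = classify_min_divergence(samples, class_pmfs, pearson_chi_square)
ml_label = classify_max_likelihood(samples, class_pmfs)
```

With the empirical distribution d fixed, the likelihood disparity differs from the negative average log-likelihood only by a term that does not depend on the class, so `ld_label` and `ml_label` always agree. The Hellinger and Pearson variants can in general select a different class.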
Related resources
Robust Estimation in Linear Regression Model: the Density Power Divergence Approach
The minimum density power divergence method provides a robust estimate when the dataset contains outliers. In this study, we introduce and use a robust minimum density power divergence estimator to estimate the parameters of the linear regression model, and then, with some numerical examples of the linear regression model, we show the robustness of this est...
Text Independent Speaker Identification with Finite Multivariate Generalized Gaussian Mixture Model with Distant Microphone Speech
An effective and efficient speaker identification (SI) system requires a robust feature extraction module followed by a speaker modeling scheme for generalized representation of these features. In recent years, speaker identification has seen significant advancement, but improvements have tended to be benchmarked on near-field speech, ignoring the more realistic setting of far field instru...
Robust Controller Design Based-on Aerodynamic Load Simulator Identification Driven by PMSM for Hardware-in-the-Loop Simulations
Aerodynamic load simulators generate the required time-varying load to test the actuator’s performance in the laboratory. The Electric Load Simulator (ELS), as one kind of dynamic load simulator, should follow the rotation of the Under Test Actuator (UTA) and apply the desired torque to the UTA’s rotor at the same time. In such a situation, a very large torque is imposed on the ELS from the...
A new SVM approach to speaker identification and verification using probabilistic distance kernels
One major SVM weakness has been the use of generic kernel functions to compute distances among data points. Polynomial, linear, and Gaussian are typical examples. They do not take full advantage of the inherent probability distributions of the data. Focusing on audio speaker identification and verification, we propose to explore the use of novel kernel functions that take full advantage of good...
Efficient Text-Independent Speaker Identification using Optimized Hierarchical Mixture Clustering
Conventional speaker identification (SI) systems use an individual Gaussian Mixture Model (GMM) for every speaker. If this method is used for large-population speaker identification systems, then during identification the likelihood computations between an unknown speaker's test feature vectors and the speaker models become a time-consuming process. This approach also increases the computationa...
Journal title:
- CoRR
Volume: abs/1512.05073
Publication date: 2015